Overview

Dataset statistics

Number of variables: 25
Number of observations: 4274187
Missing cells: 40003182
Missing cells (%): 37.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 815.2 MiB
Average record size in memory: 200.0 B
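
The percentages above follow from the raw counts. A minimal sketch of the arithmetic, assuming the missing-cell percentage is taken over all cells (variables × observations) and that 1 MiB is 2^20 bytes; the variable names are illustrative only:

```python
# Figures copied from the dataset statistics above.
n_variables = 25
n_observations = 4_274_187
missing_cells = 40_003_182
total_size_mib = 815.2

# Assumption: the denominator is every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 2**20 bytes.
avg_record_bytes = total_size_mib * 2**20 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the results agree with the reported 37.4% and 200.0 B after rounding.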

Variable types

Numeric: 4
DateTime: 2
Text: 9
Categorical: 10

Alerts

DRIVER_LICENSE_STATUS is highly imbalanced (86.2%) [Imbalance]
VEHICLE_DAMAGE_3 is highly imbalanced (53.7%) [Imbalance]
PUBLIC_PROPERTY_DAMAGE is highly imbalanced (63.2%) [Imbalance]
STATE_REGISTRATION has 321735 (7.5%) missing values [Missing]
VEHICLE_TYPE has 247609 (5.8%) missing values [Missing]
VEHICLE_MAKE has 1899905 (44.5%) missing values [Missing]
VEHICLE_MODEL has 4222807 (98.8%) missing values [Missing]
VEHICLE_YEAR has 1921158 (44.9%) missing values [Missing]
TRAVEL_DIRECTION has 1673932 (39.2%) missing values [Missing]
VEHICLE_OCCUPANTS has 1793977 (42.0%) missing values [Missing]
DRIVER_SEX has 2252000 (52.7%) missing values [Missing]
DRIVER_LICENSE_STATUS has 2346168 (54.9%) missing values [Missing]
DRIVER_LICENSE_JURISDICTION has 2342040 (54.8%) missing values [Missing]
PRE_CRASH has 928192 (21.7%) missing values [Missing]
POINT_OF_IMPACT has 1707717 (40.0%) missing values [Missing]
VEHICLE_DAMAGE has 1733402 (40.6%) missing values [Missing]
VEHICLE_DAMAGE_1 has 2633928 (61.6%) missing values [Missing]
VEHICLE_DAMAGE_2 has 3034451 (71.0%) missing values [Missing]
VEHICLE_DAMAGE_3 has 3320488 (77.7%) missing values [Missing]
PUBLIC_PROPERTY_DAMAGE has 1528858 (35.8%) missing values [Missing]
PUBLIC_PROPERTY_DAMAGE_TYPE has 4246765 (99.4%) missing values [Missing]
CONTRIBUTING_FACTOR_1 has 153529 (3.6%) missing values [Missing]
CONTRIBUTING_FACTOR_2 has 1694521 (39.6%) missing values [Missing]
VEHICLE_YEAR is highly skewed (γ1 = 55.67830854) [Skewed]
VEHICLE_OCCUPANTS is highly skewed (γ1 = 980.7930925) [Skewed]
UNIQUE_ID has unique values [Unique]
VEHICLE_OCCUPANTS has 428660 (10.0%) zeros [Zeros]

Reproduction

Analysis started: 2024-10-29 14:09:26.288214
Analysis finished: 2024-10-29 14:12:17.330572
Duration: 2 minutes and 51.04 seconds
Software version: ydata-profiling v0.0.dev0
Download configuration: config.json
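
The reported duration can be checked against the two timestamps. A small sketch using only the Python standard library; the variable names are illustrative:

```python
from datetime import datetime

# Timestamps copied from the reproduction section above.
started = datetime.fromisoformat("2024-10-29 14:09:26.288214")
finished = datetime.fromisoformat("2024-10-29 14:12:17.330572")

duration = finished - started
minutes, seconds = divmod(duration.total_seconds(), 60)
print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```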

Variables

UNIQUE_ID
Real number (ℝ)

UNIQUE 

Distinct: 4274187
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16643926
Minimum: 111711
Maximum: 20771083
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 32.6 MiB

Quantile statistics

Minimum: 111711
5-th percentile: 9719068.3
Q1: 14589050
Median: 17595763
Q3: 19526220
95-th percentile: 20508925
Maximum: 20771083
Range: 20659372
Interquartile range (IQR): 4937170

Descriptive statistics

Standard deviation: 3367520.5
Coefficient of variation (CV): 0.20232729
Kurtosis: -0.36060851
Mean: 16643926
Median Absolute Deviation (MAD): 2485262
Skewness: -0.81830805
Sum: 7.1139251 × 10^13
Variance: 1.1340194 × 10^13
Monotonicity: Not monotonic
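
Several of the UNIQUE_ID statistics above are derived from others. A minimal sketch of those relationships, using the standard definitions (range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared); the variable names are illustrative:

```python
# Values copied from the UNIQUE_ID statistics above.
minimum, maximum = 111_711, 20_771_083
q1, q3 = 14_589_050, 19_526_220
mean, std = 16_643_926, 3_367_520.5

value_range = maximum - minimum   # reported as 20659372
iqr = q3 - q1                     # reported as 4937170
cv = std / mean                   # reported as 0.20232729
variance = std ** 2               # reported as 1.1340194 × 10^13

print(value_range, iqr, round(cv, 8), f"{variance:.7e}")
```

Small differences in the last digits are expected because the report rounds the mean and standard deviation.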
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
10385780 1
 
< 0.1%
17862678 1
 
< 0.1%
18545668 1
 
< 0.1%
17112286 1
 
< 0.1%
17954158 1
 
< 0.1%
17500925 1
 
< 0.1%
17822072 1
 
< 0.1%
18387894 1
 
< 0.1%
17470706 1
 
< 0.1%
17717581 1
 
< 0.1%
Other values (4274177) 4274177
> 99.9%
ValueCountFrequency (%)
111711 1
< 0.1%
111712 1
< 0.1%
115530 1
< 0.1%
115531 1
< 0.1%
120620 1
< 0.1%
123422 1
< 0.1%
123423 1
< 0.1%
199289 1
< 0.1%
199290 1
< 0.1%
199291 1
< 0.1%
ValueCountFrequency (%)
20771083 1
< 0.1%
20771082 1
< 0.1%
20771081 1
< 0.1%
20771080 1
< 0.1%
20771079 1
< 0.1%
20771078 1
< 0.1%
20771077 1
< 0.1%
20771072 1
< 0.1%
20771071 1
< 0.1%
20771070 1
< 0.1%

COLLISION_ID
Real number (ℝ)

Distinct: 2127445
Distinct (%): 49.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3213661.5
Minimum: 22
Maximum: 4766163
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 32.6 MiB

Quantile statistics

Minimum: 22
5-th percentile: 110645
Q1: 3174562.5
Median: 3709203
Q3: 4239995
95-th percentile: 4659439
Maximum: 4766163
Range: 4766141
Interquartile range (IQR): 1065432.5

Descriptive statistics

Standard deviation: 1498334.6
Coefficient of variation (CV): 0.46623909
Kurtosis: 0.10049334
Mean: 3213661.5
Median Absolute Deviation (MAD): 532713
Skewness: -1.2597238
Sum: 1.373579 × 10^13
Variance: 2.2450066 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
4691158 42
 
< 0.1%
4539133 40
 
< 0.1%
4275782 25
 
< 0.1%
4324675 22
 
< 0.1%
3925685 22
 
< 0.1%
4541337 22
 
< 0.1%
306480 21
 
< 0.1%
4625450 20
 
< 0.1%
4578189 19
 
< 0.1%
3187017 19
 
< 0.1%
Other values (2127435) 4273935
> 99.9%
ValueCountFrequency (%)
22 2
< 0.1%
23 2
< 0.1%
24 2
< 0.1%
25 2
< 0.1%
26 2
< 0.1%
27 2
< 0.1%
28 2
< 0.1%
29 2
< 0.1%
30 2
< 0.1%
31 2
< 0.1%
ValueCountFrequency (%)
4766163 2
 
< 0.1%
4766160 2
 
< 0.1%
4766157 2
 
< 0.1%
4766156 2
 
< 0.1%
4766155 2
 
< 0.1%
4766154 2
 
< 0.1%
4766152 2
 
< 0.1%
4766151 1
 
< 0.1%
4766150 1
 
< 0.1%
4766148 6
< 0.1%

Distinct: 4497
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 32.6 MiB
Minimum: 2012-07-01 00:00:00
Maximum: 2024-10-22 00:00:00
Histogram with fixed size bins (bins=50)

Distinct: 1440
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 32.6 MiB
Minimum: 2024-10-29 00:00:00
Maximum: 2024-10-29 23:59:00
Histogram with fixed size bins (bins=50)

Distinct: 2745362
Distinct (%): 64.2%
Missing: 0
Missing (%): 0.0%
Memory size: 32.6 MiB

Length

Max length: 36
Median length: 36
Mean length: 20.863062
Min length: 1

Characters and Unicode

Total characters: 89172629
Distinct characters: 17
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2745343
Unique (%): 64.2%

Sample

1st row: 1
2nd row: 0553ab4d-9500-4cba-8d98-f4d7f89d5856
3rd row: 2
4th row: 1
5th row: 1
Value | Count | Frequency (%)
1 | 769061 | 18.0%
2 | 694883 | 16.3%
3 | 50530 | 1.2%
4 | 10398 | 0.2%
5 | 2608 | 0.1%
6 | 791 | < 0.1%
7 | 281 | < 0.1%
8 | 130 | < 0.1%
9 | 69 | < 0.1%
10 | 36 | < 0.1%
Other values (2745352) | 2745400 | 64.2%
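
The sample rows and value counts above suggest that this column mixes short integers with 36-character UUID strings. A minimal sketch of separating the two forms, assuming only those two forms matter; the helper name `classify` is hypothetical and not part of the report. The last lines give a rough estimate of UUID rows from the dash count reported for this variable (9481656), assuming every dash belongs to a canonical UUID with four dashes:

```python
import uuid
from collections import Counter

def classify(value: str) -> str:
    """Label a raw identifier as 'integer', 'uuid', or 'other'."""
    text = value.strip()
    if text.isdigit():
        return "integer"
    try:
        uuid.UUID(text)  # raises ValueError for anything that is not a UUID
    except ValueError:
        return "other"
    return "uuid"

# The five sample rows shown in the report.
sample = ["1", "0553ab4d-9500-4cba-8d98-f4d7f89d5856", "2", "1", "1"]
counts = Counter(classify(v) for v in sample)
print(counts)

# Assumption: each UUID contributes exactly four dashes.
dash_count = 9_481_656
uuid_rows_estimate = dash_count // 4
print(uuid_rows_estimate)
```

If the assumption holds, roughly 2.37 million of the 4274187 rows carry a UUID, which is consistent with a median length of 36 and a mean length near 21.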

Most occurring characters

ValueCountFrequency (%)
- 9481656
 
10.6%
4 7045658
 
7.9%
1 5540470
 
6.2%
2 5379107
 
6.0%
8 5255399
 
5.9%
9 5252223
 
5.9%
b 5041156
 
5.7%
a 5033437
 
5.6%
3 4717854
 
5.3%
5 4667774
 
5.2%
Other values (7) 31757895
35.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 51835473
58.1%
Lowercase Letter 27855500
31.2%
Dash Punctuation 9481656
 
10.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 7045658
13.6%
1 5540470
10.7%
2 5379107
10.4%
8 5255399
10.1%
9 5252223
10.1%
3 4717854
9.1%
5 4667774
9.0%
6 4660160
9.0%
0 4658578
9.0%
7 4658250
9.0%
Lowercase Letter
ValueCountFrequency (%)
b 5041156
18.1%
a 5033437
18.1%
e 4448371
16.0%
f 4445150
16.0%
c 4443797
16.0%
d 4443589
16.0%
Dash Punctuation
ValueCountFrequency (%)
- 9481656
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 61317129
68.8%
Latin 27855500
31.2%

Most frequent character per script

Common
ValueCountFrequency (%)
- 9481656
15.5%
4 7045658
11.5%
1 5540470
9.0%
2 5379107
8.8%
8 5255399
8.6%
9 5252223
8.6%
3 4717854
7.7%
5 4667774
7.6%
6 4660160
7.6%
0 4658578
7.6%
Latin
ValueCountFrequency (%)
b 5041156
18.1%
a 5033437
18.1%
e 4448371
16.0%
f 4445150
16.0%
c 4443797
16.0%
d 4443589
16.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 89172629
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 9481656
 
10.6%
4 7045658
 
7.9%
1 5540470
 
6.2%
2 5379107
 
6.0%
8 5255399
 
5.9%
9 5252223
 
5.9%
b 5041156
 
5.7%
a 5033437
 
5.6%
3 4717854
 
5.3%
5 4667774
 
5.2%
Other values (7) 31757895
35.6%

STATE_REGISTRATION
Text

MISSING 

Distinct: 82
Distinct (%): < 0.1%
Missing: 321735
Missing (%): 7.5%
Memory size: 32.6 MiB

Length

Max length: 2
Median length: 2
Mean length: 1.9999997
Min length: 1

Characters and Unicode

Total characters: 7904903
Distinct characters: 26
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 5
Unique (%): < 0.1%

Sample

1st row: NY
2nd row: NY
3rd row: NY
4th row: NY
5th row: NY
Value | Count | Frequency (%)
ny | 3287970 | 83.2%
nj | 241929 | 6.1%
pa | 89407 | 2.3%
fl | 49062 | 1.2%
ct | 44851 | 1.1%
va | 19812 | 0.5%
ma | 18723 | 0.5%
md | 18474 | 0.5%
nc | 17548 | 0.4%
ga | 14760 | 0.4%
Other values (72) | 149916 | 3.8%

Most occurring characters

ValueCountFrequency (%)
N 3575709
45.2%
Y 3289221
41.6%
J 241929
 
3.1%
A 165299
 
2.1%
P 90443
 
1.1%
C 78150
 
1.0%
T 67505
 
0.9%
L 63740
 
0.8%
M 50650
 
0.6%
F 49486
 
0.6%
Other values (16) 232771
 
2.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 7904903
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 3575709
45.2%
Y 3289221
41.6%
J 241929
 
3.1%
A 165299
 
2.1%
P 90443
 
1.1%
C 78150
 
1.0%
T 67505
 
0.9%
L 63740
 
0.8%
M 50650
 
0.6%
F 49486
 
0.6%
Other values (16) 232771
 
2.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 7904903
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 3575709
45.2%
Y 3289221
41.6%
J 241929
 
3.1%
A 165299
 
2.1%
P 90443
 
1.1%
C 78150
 
1.0%
T 67505
 
0.9%
L 63740
 
0.8%
M 50650
 
0.6%
F 49486
 
0.6%
Other values (16) 232771
 
2.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7904903
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 3575709
45.2%
Y 3289221
41.6%
J 241929
 
3.1%
A 165299
 
2.1%
P 90443
 
1.1%
C 78150
 
1.0%
T 67505
 
0.9%
L 63740
 
0.8%
M 50650
 
0.6%
F 49486
 
0.6%
Other values (16) 232771
 
2.9%

VEHICLE_TYPE
Text

MISSING 

Distinct: 2856
Distinct (%): 0.1%
Missing: 247609
Missing (%): 5.8%
Memory size: 32.6 MiB

Length

Max length: 38
Median length: 30
Mean length: 16.558065
Min length: 1

Characters and Unicode

Total characters: 66672340
Distinct characters: 75
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 1733
Unique (%): < 0.1%

Sample

1st row: PASSENGER VEHICLE
2nd row: Station Wagon/Sport Utility Vehicle
3rd row: TAXI
4th row: PASSENGER VEHICLE
5th row: PASSENGER VEHICLE
Value | Count | Frequency (%)
vehicle | 1652165 | 17.6%
utility | 1199705 | 12.8%
station | 1199628 | 12.8%
sedan | 1159968 | 12.4%
wagon/sport | 861699 | 9.2%
passenger | 770780 | 8.2%
(blank) | 340846 | 3.6%
wagon | 338052 | 3.6%
sport | 337927 | 3.6%
truck | 182549 | 1.9%
Other values (1504) | 1332960 | 14.2%
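
The sample rows show VEHICLE_TYPE mixing upper-case and title-case labels ("PASSENGER VEHICLE", "Station Wagon/Sport Utility Vehicle"), which likely contributes to the 2856 distinct values. A minimal sketch of one possible clean-up, assuming that case and surrounding whitespace carry no meaning; the helper name `normalize` is hypothetical:

```python
from collections import Counter

def normalize(label: str) -> str:
    """Trim, collapse internal whitespace, and lowercase a free-text label."""
    return " ".join(label.split()).lower()

# The five sample rows shown in the report.
sample = [
    "PASSENGER VEHICLE",
    "Station Wagon/Sport Utility Vehicle",
    "TAXI",
    "PASSENGER VEHICLE",
    "PASSENGER VEHICLE",
]
counts = Counter(normalize(v) for v in sample)
print(counts.most_common())
```

Spelling variants and abbreviations would still need a separate mapping; this only merges labels that differ in case or spacing.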

Most occurring characters

ValueCountFrequency (%)
5349701
 
8.0%
S 5141530
 
7.7%
t 4381445
 
6.6%
i 3715221
 
5.6%
E 3410448
 
5.1%
e 3086621
 
4.6%
a 3065069
 
4.6%
n 2927200
 
4.4%
o 2755960
 
4.1%
T 2170644
 
3.3%
Other values (65) 30668501
46.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 29908670
44.9%
Uppercase Letter 29852441
44.8%
Space Separator 5349701
 
8.0%
Other Punctuation 1202688
 
1.8%
Decimal Number 134918
 
0.2%
Dash Punctuation 113331
 
0.2%
Open Punctuation 55299
 
0.1%
Close Punctuation 55287
 
0.1%
Modifier Symbol 4
 
< 0.1%
Control 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 5141530
17.2%
E 3410448
11.4%
T 2170644
 
7.3%
I 1987061
 
6.7%
N 1819684
 
6.1%
V 1797120
 
6.0%
A 1636991
 
5.5%
U 1386891
 
4.6%
R 1361548
 
4.6%
W 1308892
 
4.4%
Other values (16) 7831632
26.2%
Lowercase Letter
ValueCountFrequency (%)
t 4381445
14.6%
i 3715221
12.4%
e 3086621
10.3%
a 3065069
10.2%
n 2927200
9.8%
o 2755960
9.2%
l 1797968
 
6.0%
d 1252683
 
4.2%
r 1219100
 
4.1%
c 1173762
 
3.9%
Other values (15) 4533641
15.2%
Decimal Number
ValueCountFrequency (%)
4 100321
74.4%
6 28621
 
21.2%
2 4896
 
3.6%
3 694
 
0.5%
1 132
 
0.1%
0 112
 
0.1%
5 83
 
0.1%
9 28
 
< 0.1%
8 19
 
< 0.1%
7 12
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
/ 1202629
> 99.9%
. 29
 
< 0.1%
# 10
 
< 0.1%
, 6
 
< 0.1%
? 5
 
< 0.1%
' 4
 
< 0.1%
\ 3
 
< 0.1%
& 2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
5349701
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 113331
100.0%
Open Punctuation
ValueCountFrequency (%)
( 55299
100.0%
Close Punctuation
ValueCountFrequency (%)
) 55287
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 4
100.0%
Control
ValueCountFrequency (%)
 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 59761111
89.6%
Common 6911229
 
10.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 5141530
 
8.6%
t 4381445
 
7.3%
i 3715221
 
6.2%
E 3410448
 
5.7%
e 3086621
 
5.2%
a 3065069
 
5.1%
n 2927200
 
4.9%
o 2755960
 
4.6%
T 2170644
 
3.6%
I 1987061
 
3.3%
Other values (41) 27119912
45.4%
Common
ValueCountFrequency (%)
5349701
77.4%
/ 1202629
 
17.4%
- 113331
 
1.6%
4 100321
 
1.5%
( 55299
 
0.8%
) 55287
 
0.8%
6 28621
 
0.4%
2 4896
 
0.1%
3 694
 
< 0.1%
1 132
 
< 0.1%
Other values (14) 318
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 66672340
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5349701
 
8.0%
S 5141530
 
7.7%
t 4381445
 
6.6%
i 3715221
 
5.6%
E 3410448
 
5.1%
e 3086621
 
4.6%
a 3065069
 
4.6%
n 2927200
 
4.4%
o 2755960
 
4.1%
T 2170644
 
3.3%
Other values (65) 30668501
46.0%

VEHICLE_MAKE
Text

MISSING 

Distinct: 13416
Distinct (%): 0.6%
Missing: 1899905
Missing (%): 44.5%
Memory size: 32.6 MiB

Length

Max length: 53
Median length: 13
Mean length: 12.687445
Min length: 1

Characters and Unicode

Total characters: 30123573
Distinct characters: 80
Distinct categories: 12
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 9390
Unique (%): 0.4%

Sample

1st row: TOYT -CAR/SUV
2nd row: MERZ -CAR/SUV
3rd row: FRHT-TRUCK/BUS
4th row: FORD -CAR/SUV
5th row: VOLK -CAR/SUV
Value | Count | Frequency (%)
car/suv | 2134682 | 46.8%
toyt | 405262 | 8.9%
hond | 295179 | 6.5%
niss | 239360 | 5.3%
ford | 206558 | 4.5%
chev | 113785 | 2.5%
hyun | 84478 | 1.9%
bmw | 80804 | 1.8%
merz | 78675 | 1.7%
jeep | 77851 | 1.7%
Other values (7080) | 840575 | 18.4%

Most occurring characters

ValueCountFrequency (%)
S 2901524
9.6%
R 2746430
9.1%
U 2636985
 
8.8%
C 2599265
 
8.6%
A 2381201
 
7.9%
V 2326931
 
7.7%
- 2280689
 
7.6%
/ 2269508
 
7.5%
2182927
 
7.2%
O 1091946
 
3.6%
Other values (70) 6706167
22.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 23226799
77.1%
Dash Punctuation 2280689
 
7.6%
Other Punctuation 2270323
 
7.5%
Space Separator 2182927
 
7.2%
Lowercase Letter 158301
 
0.5%
Decimal Number 4085
 
< 0.1%
Open Punctuation 224
 
< 0.1%
Close Punctuation 218
 
< 0.1%
Math Symbol 3
 
< 0.1%
Control 2
 
< 0.1%
Other values (2) 2
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 2901524
12.5%
R 2746430
11.8%
U 2636985
11.4%
C 2599265
11.2%
A 2381201
10.3%
V 2326931
10.0%
O 1091946
 
4.7%
T 1046843
 
4.5%
N 775895
 
3.3%
D 749148
 
3.2%
Other values (16) 3970631
17.1%
Lowercase Letter
ValueCountFrequency (%)
r 16867
 
10.7%
e 15502
 
9.8%
n 14399
 
9.1%
o 14363
 
9.1%
i 12105
 
7.6%
a 11003
 
7.0%
t 9440
 
6.0%
l 7275
 
4.6%
u 5933
 
3.7%
c 5677
 
3.6%
Other values (16) 45737
28.9%
Other Punctuation
ValueCountFrequency (%)
/ 2269508
> 99.9%
. 451
 
< 0.1%
, 174
 
< 0.1%
\ 64
 
< 0.1%
# 34
 
< 0.1%
& 32
 
< 0.1%
' 28
 
< 0.1%
? 17
 
< 0.1%
; 11
 
< 0.1%
: 4
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
0 944
23.1%
1 623
15.3%
5 575
14.1%
9 523
12.8%
2 394
9.6%
3 280
 
6.9%
7 245
 
6.0%
4 219
 
5.4%
6 171
 
4.2%
8 111
 
2.7%
Dash Punctuation
ValueCountFrequency (%)
- 2280689
100.0%
Space Separator
ValueCountFrequency (%)
2182927
100.0%
Open Punctuation
ValueCountFrequency (%)
( 224
100.0%
Close Punctuation
ValueCountFrequency (%)
) 218
100.0%
Math Symbol
ValueCountFrequency (%)
+ 3
100.0%
Control
ValueCountFrequency (%)
 2
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 23385100
77.6%
Common 6738473
 
22.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 2901524
12.4%
R 2746430
11.7%
U 2636985
11.3%
C 2599265
11.1%
A 2381201
10.2%
V 2326931
10.0%
O 1091946
 
4.7%
T 1046843
 
4.5%
N 775895
 
3.3%
D 749148
 
3.2%
Other values (42) 4128932
17.7%
Common
ValueCountFrequency (%)
- 2280689
33.8%
/ 2269508
33.7%
2182927
32.4%
0 944
 
< 0.1%
1 623
 
< 0.1%
5 575
 
< 0.1%
9 523
 
< 0.1%
. 451
 
< 0.1%
2 394
 
< 0.1%
3 280
 
< 0.1%
Other values (18) 1559
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 30123573
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 2901524
9.6%
R 2746430
9.1%
U 2636985
 
8.8%
C 2599265
 
8.6%
A 2381201
 
7.9%
V 2326931
 
7.7%
- 2280689
 
7.6%
/ 2269508
 
7.5%
2182927
 
7.2%
O 1091946
 
3.6%
Other values (70) 6706167
22.3%

VEHICLE_MODEL
Text

MISSING 

Distinct: 2429
Distinct (%): 4.7%
Missing: 4222807
Missing (%): 98.8%
Memory size: 32.6 MiB

Length

Max length: 25
Median length: 8
Mean length: 7.5591086
Min length: 1

Characters and Unicode

Total characters: 388387
Distinct characters: 73
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 1327
Unique (%): 2.6%

Sample

1st row: TOYT 4RN
2nd row: FORD ZZZ
3rd row: TRUCK TRADE
4th row: DODG CHA
5th row: town and country
Value | Count | Frequency (%)
zzz | 9213 | 9.7%
toyt | 8644 | 9.1%
hond | 5999 | 6.3%
niss | 5220 | 5.5%
ford | 4930 | 5.2%
cam | 3092 | 3.3%
chev | 2681 | 2.8%
acc | 1899 | 2.0%
hyun | 1575 | 1.7%
alt | 1532 | 1.6%
Other values (1769) | 50052 | 52.8%

Most occurring characters

ValueCountFrequency (%)
43457
 
11.2%
Z 32695
 
8.4%
T 27048
 
7.0%
O 25820
 
6.6%
C 22245
 
5.7%
N 21375
 
5.5%
S 18775
 
4.8%
A 17553
 
4.5%
D 17438
 
4.5%
R 16184
 
4.2%
Other values (63) 145797
37.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 323041
83.2%
Space Separator 43457
 
11.2%
Decimal Number 12683
 
3.3%
Lowercase Letter 9082
 
2.3%
Dash Punctuation 54
 
< 0.1%
Other Punctuation 54
 
< 0.1%
Open Punctuation 8
 
< 0.1%
Close Punctuation 7
 
< 0.1%
Modifier Symbol 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
Z 32695
 
10.1%
T 27048
 
8.4%
O 25820
 
8.0%
C 22245
 
6.9%
N 21375
 
6.6%
S 18775
 
5.8%
A 17553
 
5.4%
D 17438
 
5.4%
R 16184
 
5.0%
H 14460
 
4.5%
Other values (16) 109448
33.9%
Lowercase Letter
ValueCountFrequency (%)
n 995
 
11.0%
t 765
 
8.4%
a 748
 
8.2%
u 723
 
8.0%
s 649
 
7.1%
e 628
 
6.9%
r 600
 
6.6%
o 525
 
5.8%
c 458
 
5.0%
k 447
 
4.9%
Other values (16) 2544
28.0%
Decimal Number
ValueCountFrequency (%)
3 3067
24.2%
5 3005
23.7%
0 2394
18.9%
2 1344
10.6%
4 855
 
6.7%
1 528
 
4.2%
8 447
 
3.5%
7 398
 
3.1%
6 391
 
3.1%
9 254
 
2.0%
Other Punctuation
ValueCountFrequency (%)
/ 25
46.3%
. 15
27.8%
, 6
 
11.1%
? 5
 
9.3%
' 2
 
3.7%
\ 1
 
1.9%
Space Separator
ValueCountFrequency (%)
43457
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 54
100.0%
Open Punctuation
ValueCountFrequency (%)
( 8
100.0%
Close Punctuation
ValueCountFrequency (%)
) 7
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 332123
85.5%
Common 56264
 
14.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
Z 32695
 
9.8%
T 27048
 
8.1%
O 25820
 
7.8%
C 22245
 
6.7%
N 21375
 
6.4%
S 18775
 
5.7%
A 17553
 
5.3%
D 17438
 
5.3%
R 16184
 
4.9%
H 14460
 
4.4%
Other values (42) 118530
35.7%
Common
ValueCountFrequency (%)
43457
77.2%
3 3067
 
5.5%
5 3005
 
5.3%
0 2394
 
4.3%
2 1344
 
2.4%
4 855
 
1.5%
1 528
 
0.9%
8 447
 
0.8%
7 398
 
0.7%
6 391
 
0.7%
Other values (11) 378
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 388387
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
43457
 
11.2%
Z 32695
 
8.4%
T 27048
 
7.0%
O 25820
 
6.6%
C 22245
 
5.7%
N 21375
 
5.5%
S 18775
 
4.8%
A 17553
 
4.5%
D 17438
 
4.5%
R 16184
 
4.2%
Other values (63) 145797
37.5%

VEHICLE_YEAR
Real number (ℝ)

MISSING  SKEWED 

Distinct: 321
Distinct (%): < 0.1%
Missing: 1921158
Missing (%): 44.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 2015.2373
Minimum: 1000
Maximum: 20063
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 32.6 MiB

Quantile statistics

Minimum: 1000
5-th percentile: 2001
Q1: 2008
Median: 2014
Q3: 2017
95-th percentile: 2020
Maximum: 20063
Range: 19063
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 147.37309
Coefficient of variation (CV): 0.073129398
Kurtosis: 3310.8364
Mean: 2015.2373
Median Absolute Deviation (MAD): 4
Skewness: 55.678309
Sum: 4.7419118 × 10^9
Variance: 21718.827
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
2016 | 224770 | 5.3%
2015 | 222458 | 5.2%
2017 | 204686 | 4.8%
2014 | 164096 | 3.8%
2018 | 141693 | 3.3%
2013 | 140432 | 3.3%
2012 | 112251 | 2.6%
2019 | 101743 | 2.4%
2011 | 100797 | 2.4%
2007 | 91769 | 2.1%
Other values (311) | 848334 | 19.8%
(Missing) | 1921158 | 44.9%

Value | Count | Frequency (%)
1000 | 1 | < 0.1%
1111 | 2 | < 0.1%
1900 | 7 | < 0.1%
1920 | 2 | < 0.1%
1921 | 1 | < 0.1%
1923 | 1 | < 0.1%
1926 | 1 | < 0.1%
1930 | 1 | < 0.1%
1931 | 1 | < 0.1%
1932 | 1 | < 0.1%

Value | Count | Frequency (%)
20063 | 1 | < 0.1%
20015 | 2 | < 0.1%
20009 | 1 | < 0.1%
20003 | 1 | < 0.1%
19969 | 1 | < 0.1%
9999 | 741 | < 0.1%
9972 | 1 | < 0.1%
9699 | 1 | < 0.1%
9019 | 1 | < 0.1%
8888 | 1 | < 0.1%
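
The extreme values listed above (for example 1000, 1111, 9999 and 20063) are not plausible model years and probably account for much of the reported skewness of 55.68. A minimal sketch of masking them before analysis, assuming a plausible range of 1900 to 2025; both bounds and the helper name `clean_year` are assumptions, not part of the report:

```python
from typing import Optional

MIN_YEAR = 1900  # assumed lower bound for a real model year
MAX_YEAR = 2025  # assumed upper bound, one past the 2024 analysis year

def clean_year(value: Optional[float]) -> Optional[int]:
    """Return the year as an int, or None if missing or outside the assumed range."""
    if value is None or value != value:  # value != value is True only for NaN
        return None
    year = int(value)
    return year if MIN_YEAR <= year <= MAX_YEAR else None

# A few of the extreme values from the tables above, plus one typical year.
extremes = [1000, 1111, 1900, 2016, 9999, 20063]
cleaned = [clean_year(y) for y in extremes]
print(cleaned)
```

Whether 9999 (741 rows) is a deliberate "unknown" code is not stated in the report; treating it as missing is one possible reading.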

TRAVEL_DIRECTION
Categorical

MISSING 

Distinct: 15
Distinct (%): < 0.1%
Missing: 1673932
Missing (%): 39.2%
Memory size: 32.6 MiB
West: 596799
North: 595515
East: 595036
South: 588175
Unknown: 86834
Other values (10): 137896

Length

Max length: 9
Median length: 7
Mean length: 4.8151939
Min length: 1

Characters and Unicode

Total characters: 12520732
Distinct characters: 17
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: North
2nd row: East
3rd row: East
4th row: Southwest
5th row: South

Common Values

Value | Count | Frequency (%)
West | 596799 | 14.0%
North | 595515 | 13.9%
East | 595036 | 13.9%
South | 588175 | 13.8%
Unknown | 86834 | 2.0%
Northeast | 36270 | 0.8%
Southeast | 34395 | 0.8%
Southwest | 33640 | 0.8%
Northwest | 31846 | 0.7%
- | 1003 | < 0.1%
Other values (5) | 742 | < 0.1%
(Missing) | 1673932 | 39.2%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
west | 596799 | 23.0%
north | 595515 | 22.9%
east | 595036 | 22.9%
south | 588175 | 22.6%
unknown | 86834 | 3.3%
northeast | 36270 | 1.4%
southeast | 34395 | 1.3%
southwest | 33640 | 1.3%
northwest | 31846 | 1.2%
(blank) | 1003 | < 0.1%
Other values (5) | 742 | < 0.1%
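
The two TRAVEL_DIRECTION tables above give different percentages for the same counts: West is 14.0% in the first and 23.0% in the second. A minimal sketch of why, assuming the first table divides by all rows and the second by the non-missing values only; the variable names are illustrative:

```python
# Figures copied from the TRAVEL_DIRECTION section above.
n_rows = 4_274_187
n_missing = 1_673_932
west = 596_799

n_present = n_rows - n_missing

share_of_all_rows = 100 * west / n_rows      # denominator includes missing rows
share_of_present = 100 * west / n_present    # denominator excludes missing rows

print(f"{share_of_all_rows:.1f}% of all rows")
print(f"{share_of_present:.1f}% of non-missing values")
```

With 39.2% of the column missing, the choice of denominator changes every category share by a factor of about 1.64.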

Most occurring characters

ValueCountFrequency (%)
t 2647827
21.1%
o 1406675
11.2%
s 1327986
10.6%
h 1319841
10.5%
e 732950
 
5.9%
a 665701
 
5.3%
N 663808
 
5.3%
r 663631
 
5.3%
S 656416
 
5.2%
u 656210
 
5.2%
Other values (7) 1779687
14.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 9920477
79.2%
Uppercase Letter 2599252
 
20.8%
Dash Punctuation 1003
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 2647827
26.7%
o 1406675
14.2%
s 1327986
13.4%
h 1319841
13.3%
e 732950
 
7.4%
a 665701
 
6.7%
r 663631
 
6.7%
u 656210
 
6.6%
n 260502
 
2.6%
w 152320
 
1.5%
Uppercase Letter
ValueCountFrequency (%)
N 663808
25.5%
S 656416
25.3%
W 596972
23.0%
E 595195
22.9%
U 86861
 
3.3%
Dash Punctuation
ValueCountFrequency (%)
- 1003
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 12519729
> 99.9%
Common 1003
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 2647827
21.1%
o 1406675
11.2%
s 1327986
10.6%
h 1319841
10.5%
e 732950
 
5.9%
a 665701
 
5.3%
N 663808
 
5.3%
r 663631
 
5.3%
S 656416
 
5.2%
u 656210
 
5.2%
Other values (6) 1778684
14.2%
Common
ValueCountFrequency (%)
- 1003
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12520732
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 2647827
21.1%
o 1406675
11.2%
s 1327986
10.6%
h 1319841
10.5%
e 732950
 
5.9%
a 665701
 
5.3%
N 663808
 
5.3%
r 663631
 
5.3%
S 656416
 
5.2%
u 656210
 
5.2%
Other values (7) 1779687
14.2%

VEHICLE_OCCUPANTS
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 135
Distinct (%): < 0.1%
Missing: 1793977
Missing (%): 42.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1104.1254
Minimum: 0
Maximum: 1 × 10^9
Zeros: 428660
Zeros (%): 10.0%
Negative: 0
Negative (%): 0.0%
Memory size: 32.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 1
Q3: 1
95-th percentile: 3
Maximum: 1 × 10^9
Range: 1 × 10^9
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 944251.22
Coefficient of variation (CV): 855.2029
Kurtosis: 1001332.9
Mean: 1104.1254
Median Absolute Deviation (MAD): 0
Skewness: 980.79309
Sum: 2.7384628 × 10^9
Variance: 8.9161037 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 1562334 | 36.6%
0 | 428660 | 10.0%
2 | 328343 | 7.7%
3 | 94491 | 2.2%
4 | 39164 | 0.9%
5 | 14353 | 0.3%
6 | 4748 | 0.1%
7 | 2093 | < 0.1%
8 | 1251 | < 0.1%
9 | 875 | < 0.1%
Other values (125) | 3898 | 0.1%
(Missing) | 1793977 | 42.0%

Value | Count | Frequency (%)
0 | 428660 | 10.0%
1 | 1562334 | 36.6%
2 | 328343 | 7.7%
3 | 94491 | 2.2%
4 | 39164 | 0.9%
5 | 14353 | 0.3%
6 | 4748 | 0.1%
7 | 2093 | < 0.1%
8 | 1251 | < 0.1%
9 | 875 | < 0.1%

Value | Count | Frequency (%)
999999999 | 1 | < 0.1%
981990849 | 1 | < 0.1%
456817715 | 1 | < 0.1%
167820107 | 1 | < 0.1%
99999999 | 1 | < 0.1%
9999999 | 2 | < 0.1%
5292023 | 1 | < 0.1%
999999 | 3 | < 0.1%
99999 | 4 | < 0.1%
24260 | 1 | < 0.1%
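
The mean of 1104.1254 sits far from the median of 1 because a handful of extreme values dominate the sum. A minimal sketch that quantifies this from the figures above, assuming the reported sum (rounded to eight significant digits) is accurate enough for the purpose; the variable names are illustrative:

```python
# (value, count) pairs copied from the ten largest values listed above.
largest = [
    (999_999_999, 1), (981_990_849, 1), (456_817_715, 1), (167_820_107, 1),
    (99_999_999, 1), (9_999_999, 2), (5_292_023, 1), (999_999, 3),
    (99_999, 4), (24_260, 1),
]
reported_sum = 2.7384628e9           # rounded in the report
n_present = 4_274_187 - 1_793_977    # observations minus missing values

outlier_sum = sum(value * count for value, count in largest)
outlier_rows = sum(count for _, count in largest)

share = outlier_sum / reported_sum
trimmed_mean = (reported_sum - outlier_sum) / (n_present - outlier_rows)

print(f"{outlier_rows} rows hold {share:.2%} of the sum")
print(f"mean without them: {trimmed_mean:.2f}")
```

The trimmed mean is still an overestimate of typical occupancy, since other implausibly large values below 24260 remain among the "Other values" rows.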

DRIVER_SEX
Categorical

MISSING 

Distinct: 3
Distinct (%): < 0.1%
Missing: 2252000
Missing (%): 52.7%
Memory size: 32.6 MiB
M: 1496794
F: 516924
U: 8469

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2022187
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: M
2nd row: M
3rd row: M
4th row: F
5th row: M

Common Values

Value | Count | Frequency (%)
M | 1496794 | 35.0%
F | 516924 | 12.1%
U | 8469 | 0.2%
(Missing) | 2252000 | 52.7%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
m 1496794
74.0%
f 516924
 
25.6%
u 8469
 
0.4%

Most occurring characters

ValueCountFrequency (%)
M 1496794
74.0%
F 516924
 
25.6%
U 8469
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2022187
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M 1496794
74.0%
F 516924
 
25.6%
U 8469
 
0.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 2022187
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 1496794
74.0%
F 516924
 
25.6%
U 8469
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2022187
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M 1496794
74.0%
F 516924
 
25.6%
U 8469
 
0.4%

DRIVER_LICENSE_STATUS
Categorical

IMBALANCE  MISSING 

Distinct: 3
Distinct (%): < 0.1%
Missing: 2346168
Missing (%): 54.9%
Memory size: 32.6 MiB
Licensed: 1870844
Unlicensed: 39376
Permit: 17799

Length

Max length: 10
Median length: 8
Mean length: 8.0223826
Min length: 6

Characters and Unicode

Total characters: 15467306
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Licensed
2nd row: Licensed
3rd row: Licensed
4th row: Licensed
5th row: Licensed

Common Values

Value | Count | Frequency (%)
Licensed | 1870844 | 43.8%
Unlicensed | 39376 | 0.9%
Permit | 17799 | 0.4%
(Missing) | 2346168 | 54.9%
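
The alert for this variable reports an imbalance of 86.2%. A minimal sketch that reproduces the figure from the counts above, assuming the score is one minus the Shannon entropy of the non-missing class shares, normalised by the maximum entropy for three classes; that definition is an assumption here, and it happens to match the reported value:

```python
import math

# Non-missing class counts copied from the table above.
counts = {"Licensed": 1_870_844, "Unlicensed": 39_376, "Permit": 17_799}
total = sum(counts.values())

# Shannon entropy in bits of the observed class distribution.
entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())

# Assumed definition: 0 for a perfectly balanced column, 1 for a single class.
imbalance = 1 - entropy / math.log2(len(counts))
print(f"imbalance: {imbalance:.1%}")
```

Missing values are excluded from the calculation, so the score describes only the 45.1% of rows where a license status was recorded.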

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
licensed 1870844
97.0%
unlicensed 39376
 
2.0%
permit 17799
 
0.9%

Most occurring characters

ValueCountFrequency (%)
e 3838239
24.8%
n 1949596
12.6%
i 1928019
12.5%
c 1910220
12.4%
s 1910220
12.4%
d 1910220
12.4%
L 1870844
12.1%
U 39376
 
0.3%
l 39376
 
0.3%
P 17799
 
0.1%
Other values (3) 53397
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 13539287
87.5%
Uppercase Letter 1928019
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 3838239
28.3%
n 1949596
14.4%
i 1928019
14.2%
c 1910220
14.1%
s 1910220
14.1%
d 1910220
14.1%
l 39376
 
0.3%
r 17799
 
0.1%
m 17799
 
0.1%
t 17799
 
0.1%
Uppercase Letter
ValueCountFrequency (%)
L 1870844
97.0%
U 39376
 
2.0%
P 17799
 
0.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 15467306
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 3838239
24.8%
n 1949596
12.6%
i 1928019
12.5%
c 1910220
12.4%
s 1910220
12.4%
d 1910220
12.4%
L 1870844
12.1%
U 39376
 
0.3%
l 39376
 
0.3%
P 17799
 
0.1%
Other values (3) 53397
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 15467306
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 3838239
24.8%
n 1949596
12.6%
i 1928019
12.5%
c 1910220
12.4%
s 1910220
12.4%
d 1910220
12.4%
L 1870844
12.1%
U 39376
 
0.3%
l 39376
 
0.3%
P 17799
 
0.1%
Other values (3) 53397
 
0.3%

DRIVER_LICENSE_JURISDICTION
Text

MISSING

Distinct: 72
Distinct (%): < 0.1%
Missing: 2342040
Missing (%): 54.8%
Memory size: 32.6 MiB

Length

Max length: 8
Median length: 2
Mean length: 2.0027783
Min length: 2

Characters and Unicode

Total characters: 3869662
Distinct characters: 30
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 6
Unique (%): < 0.1%

Sample

1st row: NY
2nd row: FL
3rd row: NY
4th row: NY
5th row: NY
Value | Count | Frequency (%)
ny | 1664726 | 86.2%
nj | 109625 | 5.7%
pa | 34490 | 1.8%
ct | 21443 | 1.1%
fl | 20525 | 1.1%
md | 10981 | 0.6%
nc | 6845 | 0.4%
ma | 6459 | 0.3%
ga | 6446 | 0.3%
va | 6352 | 0.3%
Other values (61) | 44255 | 2.3%

Most occurring characters

ValueCountFrequency (%)
N 1788277
46.2%
Y 1665131
43.0%
J 109626
 
2.8%
A 64075
 
1.7%
C 36636
 
0.9%
P 35152
 
0.9%
T 26327
 
0.7%
L 23673
 
0.6%
M 22375
 
0.6%
F 20862
 
0.5%
Other values (20) 77528
 
2.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 3869563
> 99.9%
Decimal Number 96
 
< 0.1%
Other Punctuation 2
 
< 0.1%
Lowercase Letter 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 1788277
46.2%
Y 1665131
43.0%
J 109626
 
2.8%
A 64075
 
1.7%
C 36636
 
0.9%
P 35152
 
0.9%
T 26327
 
0.7%
L 23673
 
0.6%
M 22375
 
0.6%
F 20862
 
0.5%
Other values (16) 77429
 
2.0%
Other Punctuation
ValueCountFrequency (%)
, 1
50.0%
' 1
50.0%
Decimal Number
ValueCountFrequency (%)
1 96
100.0%
Lowercase Letter
ValueCountFrequency (%)
q 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3869564
> 99.9%
Common 98
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 1788277
46.2%
Y 1665131
43.0%
J 109626
 
2.8%
A 64075
 
1.7%
C 36636
 
0.9%
P 35152
 
0.9%
T 26327
 
0.7%
L 23673
 
0.6%
M 22375
 
0.6%
F 20862
 
0.5%
Other values (17) 77430
 
2.0%
Common
ValueCountFrequency (%)
1 96
98.0%
, 1
 
1.0%
' 1
 
1.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3869662
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 1788277
46.2%
Y 1665131
43.0%
J 109626
 
2.8%
A 64075
 
1.7%
C 36636
 
0.9%
P 35152
 
0.9%
T 26327
 
0.7%
L 23673
 
0.6%
M 22375
 
0.6%
F 20862
 
0.5%
Other values (20) 77528
 
2.0%

PRE_CRASH
Categorical

MISSING 

Distinct: 19
Distinct (%): < 0.1%
Missing: 928192
Missing (%): 21.7%
Memory size: 32.6 MiB
Going Straight Ahead: 1640811
Parked: 578642
Making Left Turn: 206267
Making Right Turn: 169138
Stopped in Traffic: 152990
Other values (14): 598147

Length

Max length: 26
Median length: 24
Mean length: 15.949931
Min length: 6

Characters and Unicode

Total characters: 53368388
Distinct characters: 38
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Going Straight Ahead
2nd row: Going Straight Ahead
3rd row: Parked
4th row: Merging
5th row: Parked

Common Values

Value                      Count     Frequency (%)
Going Straight Ahead       1640811   38.4%
Parked                     578642    13.5%
Making Left Turn           206267    4.8%
Making Right Turn          169138    4.0%
Stopped in Traffic         152990    3.6%
Slowing or Stopping        117145    2.7%
Backing                    113157    2.6%
Changing Lanes             97298     2.3%
Merging                    54183     1.3%
Starting from Parking      54176     1.3%
Other values (9)           162188    3.8%
(Missing)                  928192    21.7%
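
The percentages in this table appear to be taken over all rows, missing ones included (1640811 of the 4274187 observations is about 38.4%). A minimal sketch of building such a table with the standard library follows; `common_values` and its `top` parameter are hypothetical names chosen for illustration, and missing values are assumed to be represented as `None`.

```python
from collections import Counter

def common_values(values, top=3):
    """Return (label, count, percent) rows, with missing values as their own row."""
    n = len(values)
    present = Counter(v for v in values if v is not None)

    # The most frequent values, each as a share of ALL rows (missing included).
    rows = [(label, count, 100 * count / n)
            for label, count in present.most_common(top)]

    # Everything outside the top list is folded into a single "Other values" row.
    other = sum(present.values()) - sum(count for _, count, _ in rows)
    if other:
        rows.append((f"Other values ({len(present) - len(rows)})", other, 100 * other / n))

    missing = n - sum(present.values())
    if missing:
        rows.append(("(Missing)", missing, 100 * missing / n))
    return rows

# Hypothetical data, not drawn from the dataset.
data = ["Parked"] * 3 + ["Merging"] * 2 + ["Backing"] + [None] * 4
for row in common_values(data, top=2):
    print(row)
```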

Length

[Figure: Histogram of lengths of the category]
ValueCountFrequency (%)
going 1640811
19.7%
straight 1640811
19.7%
ahead 1640811
19.7%
parked 620199
 
7.5%
making 406847
 
4.9%
turn 406847
 
4.9%
left 207287
 
2.5%
right 170040
 
2.0%
in 169825
 
2.0%
traffic 166185
 
2.0%
Other values (23) 1243476
15.0%
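
The word counts above look consistent with lowercasing each value and splitting it on whitespace, with each word's share taken over the total number of words. That reading is an assumption based on the figures (for example, "going", "straight" and "ahead" each match the count of "Going Straight Ahead"). A small sketch under that assumption, with the hypothetical helper name `word_counts`:

```python
from collections import Counter

def word_counts(values):
    """Count lowercase whitespace-separated words across all non-missing values."""
    words = Counter()
    for value in values:
        if value is None:
            continue
        words.update(value.lower().split())
    total = sum(words.values())
    # Each row: (word, count, share of all words in percent).
    return [(word, count, 100 * count / total) for word, count in words.most_common()]

# Hypothetical values; a word can occur in more than one category label.
values = ["Going Straight Ahead", "Parked", None, "Entering Parked Position"]
for row in word_counts(values):
    print(row)
```

This would also explain why "parked" is counted more often than the category "Parked" on its own: the word can appear inside other labels.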

Most occurring characters

ValueCountFrequency (%)
i 4985275
 
9.3%
4967144
 
9.3%
a 4946483
 
9.3%
g 4710712
 
8.8%
t 4187884
 
7.8%
n 3604684
 
6.8%
h 3584624
 
6.7%
r 3259954
 
6.1%
e 2857191
 
5.4%
d 2423202
 
4.5%
Other values (28) 13841235
25.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 40395509
75.7%
Uppercase Letter 7970071
 
14.9%
Space Separator 4967144
 
9.3%
Other Punctuation 35664
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 4985275
12.3%
a 4946483
12.2%
g 4710712
11.7%
t 4187884
10.4%
n 3604684
8.9%
h 3584624
8.9%
r 3259954
8.1%
e 2857191
7.1%
d 2423202
6.0%
o 2293368
5.7%
Other values (13) 3542132
8.8%
Uppercase Letter
ValueCountFrequency (%)
S 2095462
26.3%
A 1644451
20.6%
G 1640811
20.6%
P 754262
 
9.5%
T 573032
 
7.2%
M 461030
 
5.8%
L 304585
 
3.8%
R 175602
 
2.2%
B 113157
 
1.4%
C 97298
 
1.2%
Other values (3) 110381
 
1.4%
Space Separator
ValueCountFrequency (%)
4967144
100.0%
Other Punctuation
ValueCountFrequency (%)
* 35664
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 48365580
90.6%
Common 5002808
 
9.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 4985275
10.3%
a 4946483
10.2%
g 4710712
9.7%
t 4187884
 
8.7%
n 3604684
 
7.5%
h 3584624
 
7.4%
r 3259954
 
6.7%
e 2857191
 
5.9%
d 2423202
 
5.0%
o 2293368
 
4.7%
Other values (26) 11512203
23.8%
Common
ValueCountFrequency (%)
4967144
99.3%
* 35664
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 53368388
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 4985275
 
9.3%
4967144
 
9.3%
a 4946483
 
9.3%
g 4710712
 
8.8%
t 4187884
 
7.8%
n 3604684
 
6.8%
h 3584624
 
6.7%
r 3259954
 
6.1%
e 2857191
 
5.4%
d 2423202
 
4.5%
Other values (28) 13841235
25.9%

POINT_OF_IMPACT
Categorical

MISSING 

Distinct: 19
Distinct (%): < 0.1%
Missing: 1707717
Missing (%): 40.0%
Memory size: 32.6 MiB
Center Front End: 447752
Left Front Bumper: 325169
Center Back End: 309316
Right Front Bumper: 286673
Right Front Quarter Panel: 182339
Other values (14): 1015221

Length

Max length: 25
Median length: 23
Mean length: 17.759805
Min length: 4

Characters and Unicode

Total characters: 45580007
Distinct characters: 34
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Left Front Bumper
2nd row: Right Front Bumper
3rd row: Left Front Quarter Panel
4th row: Center Front End
5th row: Right Rear Bumper

Common Values

Value                      Count     Frequency (%)
Center Front End           447752    10.5%
Left Front Bumper          325169    7.6%
Center Back End            309316    7.2%
Right Front Bumper         286673    6.7%
Right Front Quarter Panel  182339    4.3%
Left Front Quarter Panel   179769    4.2%
Left Rear Quarter Panel    146880    3.4%
Left Side Doors            135730    3.2%
Left Rear Bumper           134669    3.2%
Right Side Doors           113249    2.6%
Other values (9)           304924    7.1%
(Missing)                  1707717   40.0%

Length

[Figure: Histogram of lengths of the category]
ValueCountFrequency (%)
front 1421702
17.5%
left 922217
11.3%
bumper 835722
10.3%
right 775807
9.5%
center 757068
9.3%
end 757068
9.3%
quarter 613323
7.5%
panel 613323
7.5%
rear 475095
 
5.8%
back 309316
 
3.8%
Other values (10) 665140
8.2%

Most occurring characters

ValueCountFrequency (%)
5579311
12.2%
e 5334109
11.7%
r 5023527
 
11.0%
t 4533104
 
9.9%
n 3552695
 
7.8%
a 2130218
 
4.7%
o 1987298
 
4.4%
u 1450923
 
3.2%
F 1421702
 
3.1%
R 1256046
 
2.8%
Other values (24) 13311074
29.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 31854915
69.9%
Uppercase Letter 8145781
 
17.9%
Space Separator 5579311
 
12.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 5334109
16.7%
r 5023527
15.8%
t 4533104
14.2%
n 3552695
11.2%
a 2130218
 
6.7%
o 1987298
 
6.2%
u 1450923
 
4.6%
i 1032229
 
3.2%
d 1011127
 
3.2%
f 927361
 
2.9%
Other values (9) 4872324
15.3%
Uppercase Letter
ValueCountFrequency (%)
F 1421702
17.5%
R 1256046
15.4%
B 1145038
14.1%
L 922217
11.3%
C 757068
9.3%
E 757068
9.3%
P 613323
7.5%
Q 613323
7.5%
D 306329
 
3.8%
S 248979
 
3.1%
Other values (4) 104688
 
1.3%
Space Separator
ValueCountFrequency (%)
5579311
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 40000696
87.8%
Common 5579311
 
12.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 5334109
13.3%
r 5023527
12.6%
t 4533104
 
11.3%
n 3552695
 
8.9%
a 2130218
 
5.3%
o 1987298
 
5.0%
u 1450923
 
3.6%
F 1421702
 
3.6%
R 1256046
 
3.1%
B 1145038
 
2.9%
Other values (23) 12166036
30.4%
Common
ValueCountFrequency (%)
5579311
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 45580007
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5579311
12.2%
e 5334109
11.7%
r 5023527
 
11.0%
t 4533104
 
9.9%
n 3552695
 
7.8%
a 2130218
 
4.7%
o 1987298
 
4.4%
u 1450923
 
3.2%
F 1421702
 
3.1%
R 1256046
 
2.8%
Other values (24) 13311074
29.2%

VEHICLE_DAMAGE
Categorical

MISSING 

Distinct: 19
Distinct (%): < 0.1%
Missing: 1733402
Missing (%): 40.6%
Memory size: 32.6 MiB
Center Front End: 398007
Left Front Bumper: 267625
Center Back End: 264288
Right Front Bumper: 245725
No Damage: 241879
Other values (14): 1123261

Length

Max length: 25
Median length: 23
Mean length: 17.099184
Min length: 4

Characters and Unicode

Total characters: 43445349
Distinct characters: 34
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Left Front Quarter Panel
2nd row: Right Front Bumper
3rd row: Left Front Quarter Panel
4th row: Center Front End
5th row: Right Rear Bumper

Common Values

Value                      Count     Frequency (%)
Center Front End           398007    9.3%
Left Front Bumper          267625    6.3%
Center Back End            264288    6.2%
Right Front Bumper         245725    5.7%
No Damage                  241879    5.7%
Left Front Quarter Panel   176340    4.1%
Right Front Quarter Panel  171346    4.0%
Left Rear Quarter Panel    140224    3.3%
Left Side Doors            139861    3.3%
Left Rear Bumper           128920    3.0%
Other values (9)           366570    8.6%
(Missing)                  1733402   40.6%

Length

[Figure: Histogram of lengths of the category]
ValueCountFrequency (%)
front 1259043
16.1%
left 852970
10.9%
bumper 731608
9.3%
right 718708
9.2%
center 662295
8.5%
end 662295
8.5%
quarter 583751
7.5%
panel 583751
7.5%
rear 454323
 
5.8%
back 264288
 
3.4%
Other values (10) 1061329
13.5%

Most occurring characters

ValueCountFrequency (%)
5293576
12.2%
e 5098545
 
11.7%
r 4596523
 
10.6%
t 4127243
 
9.5%
n 3172571
 
7.3%
a 2377016
 
5.5%
o 2028301
 
4.7%
u 1318286
 
3.0%
F 1259043
 
2.9%
R 1178200
 
2.7%
Other values (24) 12996045
29.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 30317412
69.8%
Uppercase Letter 7834361
 
18.0%
Space Separator 5293576
 
12.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 5098545
16.8%
r 4596523
15.2%
t 4127243
13.6%
n 3172571
10.5%
a 2377016
7.8%
o 2028301
 
6.7%
u 1318286
 
4.3%
i 984315
 
3.2%
m 977890
 
3.2%
g 962847
 
3.2%
Other values (9) 4673875
15.4%
Uppercase Letter
ValueCountFrequency (%)
F 1259043
16.1%
R 1178200
15.0%
B 995896
12.7%
L 852970
10.9%
C 662295
8.5%
E 662295
8.5%
Q 583751
7.5%
P 583751
7.5%
D 502601
 
6.4%
S 256319
 
3.3%
Other values (4) 297240
 
3.8%
Space Separator
ValueCountFrequency (%)
5293576
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 38151773
87.8%
Common 5293576
 
12.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 5098545
13.4%
r 4596523
 
12.0%
t 4127243
 
10.8%
n 3172571
 
8.3%
a 2377016
 
6.2%
o 2028301
 
5.3%
u 1318286
 
3.5%
F 1259043
 
3.3%
R 1178200
 
3.1%
B 995896
 
2.6%
Other values (23) 12000149
31.5%
Common
ValueCountFrequency (%)
5293576
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 43445349
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5293576
12.2%
e 5098545
 
11.7%
r 4596523
 
10.6%
t 4127243
 
9.5%
n 3172571
 
7.3%
a 2377016
 
5.5%
o 2028301
 
4.7%
u 1318286
 
3.0%
F 1259043
 
2.9%
R 1178200
 
2.7%
Other values (24) 12996045
29.9%

VEHICLE_DAMAGE_1
Categorical

MISSING 

Distinct: 19
Distinct (%): < 0.1%
Missing: 2633928
Missing (%): 61.6%
Memory size: 32.6 MiB
No Damage: 456189
Left Front Bumper: 163154
Center Front End: 155283
Right Front Bumper: 130616
Left Front Quarter Panel: 103732
Other values (14): 631285

Length

Max length: 25
Median length: 23
Mean length: 15.69047
Min length: 4

Characters and Unicode

Total characters: 25736435
Distinct characters: 34
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Right Front Quarter Panel
2nd row: No Damage
3rd row: Center Back End
4th row: Left Rear Quarter Panel
5th row: Right Front Quarter Panel

Common Values

Value                      Count     Frequency (%)
No Damage                  456189    10.7%
Left Front Bumper          163154    3.8%
Center Front End           155283    3.6%
Right Front Bumper         130616    3.1%
Left Front Quarter Panel   103732    2.4%
Right Front Quarter Panel  95061     2.2%
Left Rear Bumper           85759     2.0%
Right Rear Bumper          79907     1.9%
Left Rear Quarter Panel    73990     1.7%
Left Side Doors            73733     1.7%
Other values (9)           222835    5.2%
(Missing)                  2633928   61.6%

Length

[Figure: Histogram of lengths of the category]
ValueCountFrequency (%)
front 647846
13.7%
left 500368
10.6%
bumper 459436
9.7%
no 456189
9.7%
damage 456189
9.7%
right 427964
9.1%
quarter 330465
7.0%
panel 330465
7.0%
rear 297338
6.3%
end 220874
 
4.7%
Other values (10) 598191
12.7%

Most occurring characters

ValueCountFrequency (%)
3085066
12.0%
e 2993938
 
11.6%
r 2461376
 
9.6%
t 2154601
 
8.4%
a 1941534
 
7.5%
n 1423786
 
5.5%
o 1387831
 
5.4%
m 918541
 
3.6%
g 886595
 
3.4%
u 791186
 
3.1%
Other values (24) 7691981
29.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 17926044
69.7%
Uppercase Letter 4725325
 
18.4%
Space Separator 3085066
 
12.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 2993938
16.7%
r 2461376
13.7%
t 2154601
12.0%
a 1941534
10.8%
n 1423786
7.9%
o 1387831
7.7%
m 918541
 
5.1%
g 886595
 
4.9%
u 791186
 
4.4%
i 572166
 
3.2%
Other values (9) 2394490
13.4%
Uppercase Letter
ValueCountFrequency (%)
R 727311
15.4%
F 647846
13.7%
D 597536
12.6%
B 525027
11.1%
L 500368
10.6%
N 456189
9.7%
P 330465
7.0%
Q 330465
7.0%
C 220874
 
4.7%
E 220874
 
4.7%
Other values (4) 168370
 
3.6%
Space Separator
ValueCountFrequency (%)
3085066
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 22651369
88.0%
Common 3085066
 
12.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 2993938
13.2%
r 2461376
 
10.9%
t 2154601
 
9.5%
a 1941534
 
8.6%
n 1423786
 
6.3%
o 1387831
 
6.1%
m 918541
 
4.1%
g 886595
 
3.9%
u 791186
 
3.5%
R 727311
 
3.2%
Other values (23) 6964670
30.7%
Common
ValueCountFrequency (%)
3085066
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 25736435
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3085066
12.0%
e 2993938
 
11.6%
r 2461376
 
9.6%
t 2154601
 
8.4%
a 1941534
 
7.5%
n 1423786
 
5.5%
o 1387831
 
5.4%
m 918541
 
3.6%
g 886595
 
3.4%
u 791186
 
3.1%
Other values (24) 7691981
29.9%

VEHICLE_DAMAGE_2
Categorical

MISSING 

Distinct: 19
Distinct (%): < 0.1%
Missing: 3034451
Missing (%): 71.0%
Memory size: 32.6 MiB
No Damage: 586304
Right Front Bumper: 123314
Left Front Bumper: 73650
Center Front End: 62140
Left Rear Bumper: 60839
Other values (14): 333489

Length

Max length: 25
Median length: 24
Mean length: 13.722874
Min length: 4

Characters and Unicode

Total characters: 17012741
Distinct characters: 34
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No Damage
2nd row: Left Rear Bumper
3rd row: Right Front Bumper
4th row: No Damage
5th row: No Damage

Common Values

Value                      Count     Frequency (%)
No Damage                  586304    13.7%
Right Front Bumper         123314    2.9%
Left Front Bumper          73650     1.7%
Center Front End           62140     1.5%
Left Rear Bumper           60839     1.4%
Left Front Quarter Panel   46624     1.1%
Right Rear Bumper          43800     1.0%
Right Front Quarter Panel  40766     1.0%
Left Rear Quarter Panel    39020     0.9%
Right Rear Quarter Panel   38483     0.9%
Other values (9)           124796    2.9%
(Missing)                  3034451   71.0%

Length

[Figure: Histogram of lengths of the category]
ValueCountFrequency (%)
no 586304
18.1%
damage 586304
18.1%
front 346494
10.7%
bumper 301603
9.3%
right 273192
8.5%
left 251380
7.8%
rear 182142
 
5.6%
quarter 164893
 
5.1%
panel 164893
 
5.1%
end 95203
 
2.9%
Other values (10) 278075
8.6%

Most occurring characters

ValueCountFrequency (%)
1990747
11.7%
e 1936980
11.4%
a 1722268
 
10.1%
r 1349248
 
7.9%
t 1159145
 
6.8%
o 1053551
 
6.2%
m 889792
 
5.2%
g 861734
 
5.1%
n 704892
 
4.1%
D 646265
 
3.8%
Other values (24) 4698119
27.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 11791511
69.3%
Uppercase Letter 3230483
 
19.0%
Space Separator 1990747
 
11.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1936980
16.4%
a 1722268
14.6%
r 1349248
11.4%
t 1159145
9.8%
o 1053551
8.9%
m 889792
7.5%
g 861734
7.3%
n 704892
 
6.0%
u 467357
 
4.0%
i 335584
 
2.8%
Other values (9) 1310960
11.1%
Uppercase Letter
ValueCountFrequency (%)
D 646265
20.0%
N 586304
18.1%
R 456692
14.1%
F 346494
10.7%
B 334666
10.4%
L 251380
 
7.8%
Q 164893
 
5.1%
P 164893
 
5.1%
C 95203
 
2.9%
E 95203
 
2.9%
Other values (4) 88490
 
2.7%
Space Separator
ValueCountFrequency (%)
1990747
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 15021994
88.3%
Common 1990747
 
11.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1936980
12.9%
a 1722268
11.5%
r 1349248
 
9.0%
t 1159145
 
7.7%
o 1053551
 
7.0%
m 889792
 
5.9%
g 861734
 
5.7%
n 704892
 
4.7%
D 646265
 
4.3%
N 586304
 
3.9%
Other values (23) 4111815
27.4%
Common
ValueCountFrequency (%)
1990747
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 17012741
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1990747
11.7%
e 1936980
11.4%
a 1722268
 
10.1%
r 1349248
 
7.9%
t 1159145
 
6.8%
o 1053551
 
6.2%
m 889792
 
5.2%
g 861734
 
5.1%
n 704892
 
4.1%
D 646265
 
3.8%
Other values (24) 4698119
27.6%

VEHICLE_DAMAGE_3
Categorical

IMBALANCE  MISSING 

Distinct: 19
Distinct (%): < 0.1%
Missing: 3320488
Missing (%): 77.7%
Memory size: 32.6 MiB
No Damage: 675522
Center Front End: 33747
Other: 32688
Right Front Bumper: 27348
Left Front Bumper: 26560
Other values (14): 157834

Length

Max length: 25
Median length: 9
Mean length: 11.337945
Min length: 4

Characters and Unicode

Total characters: 10812987
Distinct characters: 34
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No Damage
2nd row: No Damage
3rd row: No Damage
4th row: No Damage
5th row: No Damage

Common Values

Value                      Count     Frequency (%)
No Damage                  675522    15.8%
Center Front End           33747     0.8%
Other                      32688     0.8%
Right Front Bumper         27348     0.6%
Left Front Bumper          26560     0.6%
Left Front Quarter Panel   25385     0.6%
Right Front Quarter Panel  22265     0.5%
Center Back End            18604     0.4%
Left Rear Quarter Panel    16243     0.4%
Left Rear Bumper           15872     0.4%
Other values (9)           59465     1.4%
(Missing)                  3320488   77.7%

Length

[Figure: Histogram of lengths of the category]
ValueCountFrequency (%)
no 675522
31.0%
damage 675522
31.0%
front 135305
 
6.2%
left 96841
 
4.4%
right 87386
 
4.0%
bumper 82125
 
3.8%
quarter 78128
 
3.6%
panel 78128
 
3.6%
rear 58695
 
2.7%
end 52351
 
2.4%
Other values (10) 160502
 
7.4%

Most occurring characters

ValueCountFrequency (%)
a 1592142
14.7%
e 1245686
11.5%
1226806
11.3%
o 864223
8.0%
g 766590
 
7.1%
m 760335
 
7.0%
D 702184
 
6.5%
N 675522
 
6.2%
r 554762
 
5.1%
t 483681
 
4.5%
Other values (24) 1941056
18.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7405676
68.5%
Uppercase Letter 2180505
 
20.2%
Space Separator 1226806
 
11.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1592142
21.5%
e 1245686
16.8%
o 864223
11.7%
g 766590
10.4%
m 760335
10.3%
r 554762
 
7.5%
t 483681
 
6.5%
n 322799
 
4.4%
u 161235
 
2.2%
h 122762
 
1.7%
Other values (9) 531461
 
7.2%
Uppercase Letter
ValueCountFrequency (%)
D 702184
32.2%
N 675522
31.0%
R 147461
 
6.8%
F 135305
 
6.2%
B 100729
 
4.6%
L 96841
 
4.4%
Q 78128
 
3.6%
P 78128
 
3.6%
E 52351
 
2.4%
C 52351
 
2.4%
Other values (4) 61505
 
2.8%
Space Separator
ValueCountFrequency (%)
1226806
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9586181
88.7%
Common 1226806
 
11.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1592142
16.6%
e 1245686
13.0%
o 864223
9.0%
g 766590
8.0%
m 760335
7.9%
D 702184
7.3%
N 675522
7.0%
r 554762
 
5.8%
t 483681
 
5.0%
n 322799
 
3.4%
Other values (23) 1618257
16.9%
Common
ValueCountFrequency (%)
1226806
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10812987
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1592142
14.7%
e 1245686
11.5%
1226806
11.3%
o 864223
8.0%
g 766590
 
7.1%
m 760335
 
7.0%
D 702184
 
6.5%
N 675522
 
6.2%
r 554762
 
5.1%
t 483681
 
4.5%
Other values (24) 1941056
18.0%

PUBLIC_PROPERTY_DAMAGE
Categorical

IMBALANCE  MISSING 

Distinct: 3
Distinct (%): < 0.1%
Missing: 1528858
Missing (%): 35.8%
Memory size: 32.6 MiB
N: 2397062
Unspecified: 332361
Y: 15906

Length

Max length: 11
Median length: 1
Mean length: 2.2106418
Min length: 1
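
The length figures for this variable can be checked directly from the three value counts listed above, without touching the raw data. The sketch below does that; `length_stats` is a hypothetical helper, and the median is taken as the lower weighted median, which is an assumption about the convention used.

```python
from collections import Counter

def length_stats(value_counts):
    """Derive string-length statistics from a {value: count} mapping."""
    by_length = Counter()
    for value, count in value_counts.items():
        by_length[len(value)] += count

    rows = sum(by_length.values())
    total_characters = sum(length * count for length, count in by_length.items())

    # Lower weighted median: first length whose cumulative count covers half the rows.
    running = 0
    median = None
    for length in sorted(by_length):
        running += by_length[length]
        if 2 * running >= rows:
            median = length
            break

    return {
        "max": max(by_length),
        "median": median,
        "mean": total_characters / rows,
        "min": min(by_length),
        "total_characters": total_characters,
    }

# Counts copied from the PUBLIC_PROPERTY_DAMAGE summary above.
stats = length_stats({"N": 2397062, "Unspecified": 332361, "Y": 15906})
print(stats)
```

With these counts the total comes to 6068939 characters over 2745329 non-missing rows, a mean of about 2.2106418, which agrees with the figures reported for this variable.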

Characters and Unicode

Total characters: 6068939
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: N
2nd row: N
3rd row: N
4th row: N
5th row: N

Common Values

Value                      Count     Frequency (%)
N                          2397062   56.1%
Unspecified                332361    7.8%
Y                          15906     0.4%
(Missing)                  1528858   35.8%
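
The IMBALANCE flag on this variable corresponds to the 63.2% figure in the report's alert list. The report does not say how that score is derived; one definition that reproduces it from the three non-missing counts above is one minus the normalised Shannon entropy. The sketch below works under that assumption, and the function name `imbalance` is hypothetical.

```python
import math

def imbalance(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy over k non-empty classes.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    """
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    if len(probs) < 2:
        return 0.0  # a single class has no balance to measure
    entropy = -sum(p * math.log(p) for p in probs)
    return 1 - entropy / math.log(len(probs))

# Non-missing counts for N, Unspecified and Y from the table above.
score = imbalance([2397062, 332361, 15906])
print(f"{score:.1%}")
```

Missing values are left out of the calculation here, which is what makes the result line up with the alert.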

Length

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]
ValueCountFrequency (%)
n 2397062
87.3%
unspecified 332361
 
12.1%
y 15906
 
0.6%

Most occurring characters

ValueCountFrequency (%)
N 2397062
39.5%
e 664722
 
11.0%
i 664722
 
11.0%
U 332361
 
5.5%
n 332361
 
5.5%
s 332361
 
5.5%
p 332361
 
5.5%
c 332361
 
5.5%
f 332361
 
5.5%
d 332361
 
5.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3323610
54.8%
Uppercase Letter 2745329
45.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 664722
20.0%
i 664722
20.0%
n 332361
10.0%
s 332361
10.0%
p 332361
10.0%
c 332361
10.0%
f 332361
10.0%
d 332361
10.0%
Uppercase Letter
ValueCountFrequency (%)
N 2397062
87.3%
U 332361
 
12.1%
Y 15906
 
0.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 6068939
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 2397062
39.5%
e 664722
 
11.0%
i 664722
 
11.0%
U 332361
 
5.5%
n 332361
 
5.5%
s 332361
 
5.5%
p 332361
 
5.5%
c 332361
 
5.5%
f 332361
 
5.5%
d 332361
 
5.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6068939
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 2397062
39.5%
e 664722
 
11.0%
i 664722
 
11.0%
U 332361
 
5.5%
n 332361
 
5.5%
s 332361
 
5.5%
p 332361
 
5.5%
c 332361
 
5.5%
f 332361
 
5.5%
d 332361
 
5.5%

PUBLIC_PROPERTY_DAMAGE_TYPE
Text

MISSING 

Distinct: 19998
Distinct (%): 72.9%
Missing: 4246765
Missing (%): 99.4%
Memory size: 32.6 MiB

Length

Max length: 866
Median length: 383
Mean length: 38.235468
Min length: 1

Characters and Unicode

Total characters: 1048493
Distinct characters: 61
Distinct categories: 11
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 18975
Unique (%): 69.2%
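
"Distinct" and "Unique" measure different things: distinct counts the different values present, while unique counts the values that occur exactly once. Both percentages here seem to use the non-missing rows as the denominator (4274187 - 4246765 = 27422 rows, so 19998 / 27422 is about 72.9% and 18975 / 27422 about 69.2%). A minimal sketch of the two measures under that assumption, using the hypothetical helper `distinct_and_unique`:

```python
from collections import Counter

def distinct_and_unique(values):
    """Count distinct values and values seen exactly once, ignoring missing entries."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    n = len(present)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return {
        "distinct": distinct,
        "distinct_pct": 100 * distinct / n if n else 0.0,
        "unique": unique,
        "unique_pct": 100 * unique / n if n else 0.0,
    }

# Hypothetical free-text entries; "FENCE" repeats, so it is distinct but not unique.
print(distinct_and_unique(["UTILITY POLE", "FENCE", "FENCE", None, "TREE"]))
```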

Sample

1st row: UTILITY POLE
2nd row: PASSENGER FRONT SIDE DAMAGED
3rd row: FENCE OF A SCHOOL IN THE BACK
4th row: POWERLINE CABLES IN FRONT OF 4236 BEDFORD AVENUE
5th row: BRICK FENCE WAS STRUCK BY MV1 WHEN TRYING TO PARK.
ValueCountFrequency (%)
fence 6459
 
3.6%
of 6198
 
3.4%
and 4957
 
2.7%
to 4591
 
2.5%
the 4114
 
2.3%
pole 3913
 
2.2%
damage 3530
 
2.0%
front 3083
 
1.7%
vehicle 2797
 
1.5%
light 2567
 
1.4%
Other values (12108) 138618
76.7%

Most occurring characters

ValueCountFrequency (%)
153405
14.6%
E 102434
 
9.8%
A 71928
 
6.9%
T 68243
 
6.5%
O 64902
 
6.2%
N 64037
 
6.1%
I 57625
 
5.5%
R 55324
 
5.3%
D 44189
 
4.2%
L 41926
 
4.0%
Other values (51) 324480
30.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 848736
80.9%
Space Separator 153405
 
14.6%
Decimal Number 27887
 
2.7%
Other Punctuation 15234
 
1.5%
Dash Punctuation 1815
 
0.2%
Open Punctuation 651
 
0.1%
Close Punctuation 650
 
0.1%
Currency Symbol 85
 
< 0.1%
Math Symbol 17
 
< 0.1%
Control 11
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 102434
12.1%
A 71928
 
8.5%
T 68243
 
8.0%
O 64902
 
7.6%
N 64037
 
7.5%
I 57625
 
6.8%
R 55324
 
6.5%
D 44189
 
5.2%
L 41926
 
4.9%
S 41423
 
4.9%
Other values (16) 236705
27.9%
Other Punctuation
ValueCountFrequency (%)
. 8262
54.2%
, 2878
 
18.9%
/ 1968
 
12.9%
# 941
 
6.2%
' 550
 
3.6%
& 218
 
1.4%
: 138
 
0.9%
@ 115
 
0.8%
? 73
 
0.5%
; 67
 
0.4%
Other values (2) 24
 
0.2%
Decimal Number
ValueCountFrequency (%)
1 6725
24.1%
2 4150
14.9%
0 3314
11.9%
3 2584
 
9.3%
5 2248
 
8.1%
4 2212
 
7.9%
6 1769
 
6.3%
8 1681
 
6.0%
7 1671
 
6.0%
9 1533
 
5.5%
Math Symbol
ValueCountFrequency (%)
= 8
47.1%
+ 7
41.2%
~ 1
 
5.9%
> 1
 
5.9%
Open Punctuation
ValueCountFrequency (%)
( 650
99.8%
[ 1
 
0.2%
Close Punctuation
ValueCountFrequency (%)
) 649
99.8%
] 1
 
0.2%
Space Separator
ValueCountFrequency (%)
153405
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1815
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 85
100.0%
Control
ValueCountFrequency (%)
 11
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 848736
80.9%
Common 199757
 
19.1%

Most frequent character per script

Common
ValueCountFrequency (%)
153405
76.8%
. 8262
 
4.1%
1 6725
 
3.4%
2 4150
 
2.1%
0 3314
 
1.7%
, 2878
 
1.4%
3 2584
 
1.3%
5 2248
 
1.1%
4 2212
 
1.1%
/ 1968
 
1.0%
Other values (25) 12011
 
6.0%
Latin
ValueCountFrequency (%)
E 102434
12.1%
A 71928
 
8.5%
T 68243
 
8.0%
O 64902
 
7.6%
N 64037
 
7.5%
I 57625
 
6.8%
R 55324
 
6.5%
D 44189
 
5.2%
L 41926
 
4.9%
S 41423
 
4.9%
Other values (16) 236705
27.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1048493
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
153405
14.6%
E 102434
 
9.8%
A 71928
 
6.9%
T 68243
 
6.5%
O 64902
 
6.2%
N 64037
 
6.1%
I 57625
 
5.5%
R 55324
 
5.3%
D 44189
 
4.2%
L 41926
 
4.0%
Other values (51) 324480
30.9%

CONTRIBUTING_FACTOR_1
Text

MISSING 

Distinct: 61
Distinct (%): < 0.1%
Missing: 153529
Missing (%): 3.6%
Memory size: 32.6 MiB

Length

Max length: 53
Median length: 11
Mean length: 16.335901
Min length: 1

Characters and Unicode

Total characters: 67314662
Distinct characters: 55
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Unspecified
2nd row: Driver Inattention/Distraction
3rd row: Driver Inattention/Distraction
4th row: Unspecified
5th row: Other Vehicular
ValueCountFrequency (%)
unspecified 2417427
36.3%
driver 567912
 
8.5%
inattention/distraction 527014
 
7.9%
too 198897
 
3.0%
closely 198897
 
3.0%
to 175570
 
2.6%
failure 152517
 
2.3%
yield 145302
 
2.2%
right-of-way 145302
 
2.2%
following 136420
 
2.0%
Other values (96) 2001668
30.0%

Most occurring characters

ValueCountFrequency (%)
i 8755933
13.0%
e 8221225
 
12.2%
n 5913284
 
8.8%
s 4151947
 
6.2%
t 3541043
 
5.3%
c 3495358
 
5.2%
r 3022883
 
4.5%
o 2950811
 
4.4%
d 2910010
 
4.3%
f 2856966
 
4.2%
Other values (45) 21495202
31.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 56569759
84.0%
Uppercase Letter 7231582
 
10.7%
Space Separator 2546268
 
3.8%
Other Punctuation 668620
 
1.0%
Dash Punctuation 293059
 
0.4%
Open Punctuation 2553
 
< 0.1%
Close Punctuation 2553
 
< 0.1%
Decimal Number 268
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 8755933
15.5%
e 8221225
14.5%
n 5913284
10.5%
s 4151947
 
7.3%
t 3541043
 
6.3%
c 3495358
 
6.2%
r 3022883
 
5.3%
o 2950811
 
5.2%
d 2910010
 
5.1%
f 2856966
 
5.1%
Other values (15) 10750299
19.0%
Uppercase Letter
ValueCountFrequency (%)
U 2689907
37.2%
D 1276856
17.7%
I 741471
 
10.3%
F 358478
 
5.0%
C 351725
 
4.9%
T 312161
 
4.3%
P 231120
 
3.2%
R 201358
 
2.8%
O 174903
 
2.4%
L 168002
 
2.3%
Other values (12) 725601
 
10.0%
Decimal Number
ValueCountFrequency (%)
8 126
47.0%
0 126
47.0%
1 16
 
6.0%
Space Separator
ValueCountFrequency (%)
2546268
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 668620
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 293059
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2553
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2553
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 63801341
94.8%
Common 3513321
 
5.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 8755933
13.7%
e 8221225
12.9%
n 5913284
 
9.3%
s 4151947
 
6.5%
t 3541043
 
5.6%
c 3495358
 
5.5%
r 3022883
 
4.7%
o 2950811
 
4.6%
d 2910010
 
4.6%
f 2856966
 
4.5%
Other values (37) 17981881
28.2%
Common
ValueCountFrequency (%)
2546268
72.5%
/ 668620
 
19.0%
- 293059
 
8.3%
( 2553
 
0.1%
) 2553
 
0.1%
8 126
 
< 0.1%
0 126
 
< 0.1%
1 16
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 67314662
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 8755933
13.0%
e 8221225
 
12.2%
n 5913284
 
8.8%
s 4151947
 
6.2%
t 3541043
 
5.3%
c 3495358
 
5.2%
r 3022883
 
4.5%
o 2950811
 
4.4%
d 2910010
 
4.3%
f 2856966
 
4.2%
Other values (45) 21495202
31.9%

CONTRIBUTING_FACTOR_2
Text

MISSING 

Distinct: 56
Distinct (%): < 0.1%
Missing: 1694521
Missing (%): 39.6%
Memory size: 32.6 MiB

Length

Max length: 53
Median length: 11
Mean length: 13.768191
Min length: 1

Characters and Unicode

Total characters: 35517335
Distinct characters: 53
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Unspecified
2nd row: Unsafe Lane Changing
3rd row: Unspecified
4th row: Unspecified
5th row: Unspecified
ValueCountFrequency (%)
unspecified 2019132
57.7%
driver 178999
 
5.1%
inattention/distraction 144627
 
4.1%
too 88000
 
2.5%
closely 88000
 
2.5%
lane 63931
 
1.8%
passing 61819
 
1.8%
following 60394
 
1.7%
unsafe 58227
 
1.7%
to 53054
 
1.5%
Other values (94) 682176
 
19.5%

Most occurring characters

ValueCountFrequency (%)
e 5280674
14.9%
i 5225177
14.7%
n 3155455
8.9%
s 2584018
 
7.3%
c 2320563
 
6.5%
p 2216799
 
6.2%
d 2186188
 
6.2%
f 2182232
 
6.1%
U 2133085
 
6.0%
r 979534
 
2.8%
Other values (43) 7253610
20.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 30684861
86.4%
Uppercase Letter 3638339
 
10.2%
Space Separator 918693
 
2.6%
Other Punctuation 184068
 
0.5%
Dash Punctuation 88576
 
0.2%
Open Punctuation 1380
 
< 0.1%
Close Punctuation 1380
 
< 0.1%
Decimal Number 38
 
< 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 5280674 | 17.2%
i | 5225177 | 17.0%
n | 3155455 | 10.3%
s | 2584018 | 8.4%
c | 2320563 | 7.6%
p | 2216799 | 7.2%
d | 2186188 | 7.1%
f | 2182232 | 7.1%
r | 979534 | 3.2%
o | 973781 | 3.2%
Other values (15) | 3580440 | 11.7%
Uppercase Letter
Value | Count | Frequency (%)
U | 2133085 | 58.6%
D | 364080 | 10.0%
I | 243450 | 6.7%
C | 145907 | 4.0%
T | 128054 | 3.5%
F | 111111 | 3.1%
P | 88189 | 2.4%
L | 73424 | 2.0%
R | 69315 | 1.9%
O | 51544 | 1.4%
Other values (12) | 230180 | 6.3%
Space Separator
Value | Count | Frequency (%)
(space) | 918693 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
/ | 184068 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 88576 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 1380 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 1380 | 100.0%
Decimal Number
Value | Count | Frequency (%)
1 | 38 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 34323200 | 96.6%
Common | 1194135 | 3.4%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 5280674 | 15.4%
i | 5225177 | 15.2%
n | 3155455 | 9.2%
s | 2584018 | 7.5%
c | 2320563 | 6.8%
p | 2216799 | 6.5%
d | 2186188 | 6.4%
f | 2182232 | 6.4%
U | 2133085 | 6.2%
r | 979534 | 2.9%
Other values (37) | 6059475 | 17.7%
Common
Value | Count | Frequency (%)
(space) | 918693 | 76.9%
/ | 184068 | 15.4%
- | 88576 | 7.4%
( | 1380 | 0.1%
) | 1380 | 0.1%
1 | 38 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 35517335 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 5280674 | 14.9%
i | 5225177 | 14.7%
n | 3155455 | 8.9%
s | 2584018 | 7.3%
c | 2320563 | 6.5%
p | 2216799 | 6.2%
d | 2186188 | 6.2%
f | 2182232 | 6.1%
U | 2133085 | 6.0%
r | 979534 | 2.8%
Other values (43) | 7253610 | 20.4%

Interactions

[Figure: 16 interaction plot panels]

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
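
One common way to quantify nullity correlation is to turn each column into a 0/1 missingness indicator and take the Pearson correlation between indicators. The sketch below assumes that definition; the report does not state the exact formula behind its heatmap. The toy records are invented (only the column names come from this report), and the helpers `null_indicator` and `pearson` are hypothetical.

```python
from math import sqrt

# Hypothetical toy records: column names from the report, values invented.
rows = [
    {"VEHICLE_MAKE": "TOYT", "VEHICLE_YEAR": 2002.0, "PRE_CRASH": "Parked"},
    {"VEHICLE_MAKE": None, "VEHICLE_YEAR": None, "PRE_CRASH": "Merging"},
    {"VEHICLE_MAKE": "FORD", "VEHICLE_YEAR": 2005.0, "PRE_CRASH": None},
    {"VEHICLE_MAKE": None, "VEHICLE_YEAR": None, "PRE_CRASH": "Parked"},
]


def null_indicator(records, column):
    """Return 1 where the column is missing and 0 where it is present."""
    return [1 if rec[column] is None else 0 for rec in records]


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.

    Undefined (division by zero) if either sequence is constant,
    i.e. a column that is fully present or fully missing.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


make = null_indicator(rows, "VEHICLE_MAKE")
year = null_indicator(rows, "VEHICLE_YEAR")
pre_crash = null_indicator(rows, "PRE_CRASH")

# Nullity by column: number of missing values per column.
missing_counts = {
    "VEHICLE_MAKE": sum(make),
    "VEHICLE_YEAR": sum(year),
    "PRE_CRASH": sum(pre_crash),
}

make_year = pearson(make, year)        # columns missing together
make_pre = pearson(make, pre_crash)    # columns missing in different rows
```

In this toy data `VEHICLE_MAKE` and `VEHICLE_YEAR` are always missing together, so their nullity correlation is at its maximum, while `PRE_CRASH` is missing only where `VEHICLE_MAKE` is present, giving a negative value.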

Sample

UNIQUE_ID | COLLISION_ID | CRASH_DATE | CRASH_TIME | VEHICLE_ID | STATE_REGISTRATION | VEHICLE_TYPE | VEHICLE_MAKE | VEHICLE_MODEL | VEHICLE_YEAR | TRAVEL_DIRECTION | VEHICLE_OCCUPANTS | DRIVER_SEX | DRIVER_LICENSE_STATUS | DRIVER_LICENSE_JURISDICTION | PRE_CRASH | POINT_OF_IMPACT | VEHICLE_DAMAGE | VEHICLE_DAMAGE_1 | VEHICLE_DAMAGE_2 | VEHICLE_DAMAGE_3 | PUBLIC_PROPERTY_DAMAGE | PUBLIC_PROPERTY_DAMAGE_TYPE | CONTRIBUTING_FACTOR_1 | CONTRIBUTING_FACTOR_2
01038578010020109/07/20129:031NYPASSENGER VEHICLENaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNUnspecifiedNaN
119140702421308209/23/20198:150553ab4d-9500-4cba-8d98-f4d7f89d5856NYStation Wagon/Sport Utility VehicleTOYT -CAR/SUVNaN2002.0North1.0MLicensedNYGoing Straight AheadLeft Front BumperLeft Front Quarter PanelNaNNaNNaNNNaNDriver Inattention/DistractionUnspecified
214887647330760810/02/201517:182NYTAXINaNNaNNaNNaNNaNNaNNaNNaNGoing Straight AheadNaNNaNNaNNaNNaNNaNNaNDriver Inattention/DistractionNaN
314889754330869310/04/201520:341NYPASSENGER VEHICLENaNNaNNaNNaNNaNNaNNaNNaNParkedNaNNaNNaNNaNNaNNaNNaNUnspecifiedNaN
41440027029766604/25/201321:151NYPASSENGER VEHICLENaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNOther VehicularNaN
517044639343415505/02/201617:35219456NY4 dr sedanMERZ -CAR/SUVNaN2015.0East2.0MLicensedFLMergingRight Front BumperRight Front BumperRight Front Quarter PanelNaNNaNNNaNDriver Inattention/DistractionUnsafe Lane Changing
619138701422906710/24/201913:15c53b43d9-419a-4ab1-9361-3f2979078d89NYBusFRHT-TRUCK/BUSNaN2006.0East13.0MLicensedNYParkedLeft Front Quarter PanelLeft Front Quarter PanelNaNNaNNaNNNaNUnspecifiedUnspecified
717303317350302708/18/201612:39672828NYStation Wagon/Sport Utility VehicleFORD -CAR/SUVNaN2005.0Southwest2.0FLicensedNYGoing Straight AheadCenter Front EndCenter Front EndNo DamageNo DamageNo DamageNNaNDriver Inattention/DistractionUnspecified
81225453619642507/16/201311:201NYPASSENGER VEHICLENaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNUnspecifiedNaN
911804847297589711/26/201218:122NYPASSENGER VEHICLENaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNDriver Inattention/DistractionNaN
UNIQUE_ID | COLLISION_ID | CRASH_DATE | CRASH_TIME | VEHICLE_ID | STATE_REGISTRATION | VEHICLE_TYPE | VEHICLE_MAKE | VEHICLE_MODEL | VEHICLE_YEAR | TRAVEL_DIRECTION | VEHICLE_OCCUPANTS | DRIVER_SEX | DRIVER_LICENSE_STATUS | DRIVER_LICENSE_JURISDICTION | PRE_CRASH | POINT_OF_IMPACT | VEHICLE_DAMAGE | VEHICLE_DAMAGE_1 | VEHICLE_DAMAGE_2 | VEHICLE_DAMAGE_3 | PUBLIC_PROPERTY_DAMAGE | PUBLIC_PROPERTY_DAMAGE_TYPE | CONTRIBUTING_FACTOR_1 | CONTRIBUTING_FACTOR_2
427417720771048476597910/21/20246:50319cefce-1311-4997-bc22-274b8d6659d9NYSedanHOND -CAR/SUVNaN2020.0North1.0MLicensedNYGoing Straight AheadCenter Front EndCenter Front EndLeft Front BumperRight Front BumperNaNNNaNAlcohol InvolvementUnspecified
427417820770066476570210/22/202422:4543146b73-e661-41e5-82be-02c5395de86eNYSedanBMW -CAR/SUVNaN2015.0North1.0FLicensedNYGoing Straight AheadCenter Front EndCenter Front EndLeft Front BumperRight Front BumperNaNNNaNDriver Inattention/DistractionUnspecified
427417920771061476599310/22/202415:21e5f99a9d-bcaa-4f4e-8e14-474e70d59554NMMotorcycleNaNNaN2023.0North1.0MUnlicensedNaNGoing Straight AheadLeft Front Quarter PanelLeft Front Quarter PanelRight Side DoorsRight Rear Quarter PanelNaNNNaNDriver InexperiencePavement Slippery
427418020770574476593210/21/202411:56c323f98d-4868-4fb3-8a36-5960a2148876NYSedanTOYT -CAR/SUVNaN2022.0EastNaNNaNNaNNaNMaking Left TurnLeft Front BumperNo DamageNo DamageNo DamageNo DamageNNaNDriver Inattention/DistractionFailure to Yield Right-of-Way
427418120769494476553010/22/20248:45f1c9967b-2f3b-4731-af57-6c4813e39580NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNUnspecifiedSCAFFOLDINGNaNNaN
427418220768968476528310/22/20240:028fb05f7d-c2e5-4f93-a575-26b44434bd8bNaNPick-up TruckTOYT -CAR/SUVNaNNaNWestNaNNaNNaNNaNGoing Straight AheadLeft Front Quarter PanelNaNNaNNaNNaNUnspecifiedNaNAggressive Driving/Road RageUnsafe Speed
427418320770070476566010/22/202415:509a851c1b-4eb2-4e0e-a29f-e2357e16a5b8NYStation Wagon/Sport Utility VehicleLNDR -CAR/SUVNaN2015.0West1.0FLicensedNYGoing Straight AheadLeft Front BumperNo DamageNaNNaNNaNNNaNUnspecifiedUnspecified
427418420770400476576210/22/202414:3045cf331d-219e-4298-be44-fff4bd0074dcMDStation Wagon/Sport Utility VehicleJEEP -CAR/SUVNaN2020.0North1.0NaNNaNNaNGoing Straight AheadCenter Front EndOtherNaNNaNNaNNNaNDriver Inattention/DistractionUnspecified
427418520769640476535210/22/20247:210c3b734e-2ca0-4592-bea5-2bbc35cce0d2NYBusNaNNaNNaNWest5.0FLicensedNYGoing Straight AheadCenter Front EndNo DamageNo DamageNo DamageNo DamageNNaNFollowing Too CloselyUnspecified
427418620770629476602410/09/202419:2281045819-f07d-4333-adf9-698d0edb85a4NYSedanTOYT -CAR/SUVNaN2010.0East3.0MLicensedNYGoing Straight AheadLeft Rear BumperCenter Back EndLeft Rear Quarter PanelNaNNaNNNaNPassing or Lane Usage ImproperUnspecified